We learned how the basic Paxos algorithm works with many examples. This lesson revisits many Paxos rules and explores the rationale behind those decisions.

The three roles#

This design choice is very intuitive. Let’s continue with the example of deciding on a food place for Lunch. Some members need to propose some suggestions; otherwise, we would not have any options to consider. The members who give suggestions can be called proposers. Because we need to choose one place from the total list of suggestions, there should be some mechanism that can help us decide. One way to do that is through voting. Then, we need voters that can vote against the proposed suggestions. We’ll call these members of the group the acceptors.

We should also consider a situation where some of the members are not available or have no issue with whatever the final decision is. They just learn the final decision and are happy with that. Also, all the available members of the group need to learn the final chosen value. We call these members of the group the learners.

svg viewer

Here, we might think that the final chosen value would be the one that got the maximum number of votes. But would that work? Let's see in the following section.

Getting majority votes#

We saw in the previous lesson how in the Paxos protocol, a value is chosen only if it gets majority votes. We might wonder why we need a majority (more than half of the total number of votes) when instead we can just choose the value with the highest number of votes. Let's state the difference between having the highest number of votes and having majority votes in the following way to understand why Paxos needs to get majority votes to choose a value.

  • The highest number of votes: Let’s consider that there are a total of 6 members of the group. Three members voted for proposal 1, two voted for proposal 2, and the remaining one voted for proposal 3. Here, the proposal that got the highest number of votes is proposal 1. None of the proposals got majority votes because the majority consists of more than half of the total group members, which is 4 out of 6 in this example.
Choosing a value upon which a large number of members of the group raise their hand
Choosing a value upon which a large number of members of the group raise their hand
  • Majority votes: Consider the same situation, that there are a total of 6 members of the group. Four members voted for proposal 1, while two voted for proposal 2. Here, the proposal that got majority votes is proposal 1.

    Note: The majority comprises more than half of the total members of the group. In real systems, the total number of the group is usually an odd number so we can’t have an even split of votes.

Choosing a value upon which majority members of the group raise their hand
Choosing a value upon which majority members of the group raise their hand

What could be the problem with getting the highest number of votes to choose the final value? This approach can result in a tie. Multiple proposals can get the same number of votes. And in such a situation, we cannot choose a single value. To avoid this issue, we use the majority votes approach.

Obtaining a majority of votes facilitates the selection of a single value because any two majorities share at least one acceptor. This is applicable if an acceptor can only accept one value; that’s why we have a value selection condition in the Paxos protocol that needs to be followed by the proposers while sending an accept request. This condition guarantees that an acceptor will accept at most one value.

Note: Taking majority votes enables us to stick with a decision once a decision has been made.

Point to ponder

Question

Why don’t we designate a single acceptor responsible for choosing the first proposed value that it receives?

Hide Answer

This is an easy solution to choose a value, but it is unreliable. The failure of the single acceptor makes it impossible to make progress and choose the value.

A single acceptor’s failure probability is higher than that of a majority set of acceptors.

The two-step protocol#

Why does the Paxos protocol consist of two phases? Can't we reach a consensus in a single step (requesting the majority to accept a value)?

We will start with the contradiction, assuming that the proposers are directly proposing the value and requesting the acceptors accept that value. And we will then examine the need that led to adding a prepare phase before the accept phase. Here, acceptors will follow the rule of accepting the first proposal they see and then never changing it.

Created with Fabric.js 3.6.6
There are four members of the Paxos group, A, B, C, and D, where each member plays all the roles

1 of 9

Created with Fabric.js 3.6.6
Proposer A proposes a value val_A against proposal number 1 to all the acceptors, including itself

2 of 9

Created with Fabric.js 3.6.6
Two acceptors, A and B, accept the value proposed by A and write to the disk, while two acceptors C, and D, fail before accepting the value

3 of 9

Created with Fabric.js 3.6.6
Proposer A receives acceptance of its proposed value from 2 out of 4 members of the group. Since 2 votes is not a majority, val_A can't be chosen.

4 of 9

Created with Fabric.js 3.6.6
C and D recover, but before proposer A tries again to get its value accepted by the majority, it itself fails

5 of 9

Created with Fabric.js 3.6.6
Now, C proposes a value val_C to all the available acceptors in the group B, C, and D

6 of 9

Created with Fabric.js 3.6.6
B rejects proposal 2 with val_C issued by C because it has already accepted a value. It can't accept more than one value, while C and D accept val_B.

7 of 9

Created with Fabric.js 3.6.6
C gets its proposed value val_C accepted by two acceptors, C and D. Since 2 is not the majority from a group of 4 members, val_C can't be chosen

8 of 9

Created with Fabric.js 3.6.6
None of the proposers get their proposed value accepted by a majority of the acceptors so no value is chosen. Also, there is no way acceptors change their decision, so any new proposal can't get accepted, and therefore a value can never be chosen.

9 of 9

Suppose we let the acceptors change their decision by allowing them to accept the latest request and discard the old one. In that case, multiple proposals (every new proposal) can get the majority on a proposed value of their own choice and choose that value. This leads to multiple and different values being chosen after a value has already been chosen. This also violates the safety property.

So knowing about the proposals already in motion before proposing a new proposal with a new value is a must for safety as well as liveness. Therefore, there is a two-phase protocol where the proposers discover the already accepted proposal in the first phase and then wisely propose the value from the values already in motion. This helps choose a single value. The safety property can not be violated this way. This protocol is shown in the following illustration:

Created with Fabric.js 3.6.6
There are four members of the Paxos group, A, B, C, and D, where each member plays all the roles

1 of 15

Created with Fabric.js 3.6.6
A discovers the already accepted values by sending a prepare request with proposal number 1 to all the acceptors

2 of 15

Created with Fabric.js 3.6.6
A receives promises on proposal number 1 from majority acceptors A, B, and C, but it has not received any accepted value in response from the acceptors, which means those acceptors have not accepted any value till now

3 of 15

Created with Fabric.js 3.6.6
A proposes a value of its own choice, val_A, by sending an accept request to the majority acceptors, A, B, and C, from whom it received promises

4 of 15

Created with Fabric.js 3.6.6
Two acceptors, A and B, accept val_A, but the third acceptor (C), that promised to proposer A on proposal number 1 fails before accepting its value

5 of 15

Created with Fabric.js 3.6.6
One of the majority acceptors that promised to A fails before accepting the value val_A proposed by A

6 of 15

Created with Fabric.js 3.6.6
A wasn’t able to get its value accepted by the majority 3 out of 4 group members because of the failure of C

7 of 15

Created with Fabric.js 3.6.6
The failed group members, C and D, recover. But A, which was trying to get its proposal accepted and chosen, fails now

8 of 15

Created with Fabric.js 3.6.6
Now C takes the turn and initiates its proposal by first discovering the already accepted values by sending a prepare request to all the available acceptors, B, C, and D

9 of 15

Created with Fabric.js 3.6.6
C receives promises on proposal number 2 from the majority acceptors B, C, and D. Among the responses to the prepare request, C receives an accepted value, val_A, against proposal number 1

10 of 15

Created with Fabric.js 3.6.6
C is bound to choose the value with the highest proposal number in the prepare request responses it received from the acceptors. There was only one accepted value with proposal number 1, so C proposes val_A with proposal number 2 to the acceptors that promised to it.

11 of 15

Created with Fabric.js 3.6.6
B updates the proposal number against val_A, while C and D accept val_A on proposal number 2 and write to the disk

12 of 15

Created with Fabric.js 3.6.6
C receives accepted responses from 3 out of 4 group member B, C, and D on val_A for proposal number 2

13 of 15

Created with Fabric.js 3.6.6
3 out of 4 is the majority

14 of 15

Created with Fabric.js 3.6.6
C, by knowing that it received the acceptance on val_A from the majority, chooses val_A. And once the value is chosen, all group members learn the chosen value.

15 of 15

Rules followed by the acceptors#

As we saw in the previous lesson, the following rules need to be followed by the acceptors. They are easy to understand and intuitive.

  • Accepting at least one proposal: It ensures that a value can be chosen where we have only one proposal. Therefore, the acceptors should always accept the first proposal they receive. If we don't make it necessary, that can lead to a situation where no value is chosen.

  • Promising a proposal with a higher proposal number: It ensures that we include every new proposal in the final decision.

Rules followed by the proposers#

The proposers send their proposals according to the rules below:

  • A new proposal can be proposed with a large number: It helps us because we do not keep track of every proposal; we just need to remember the latest proposal.

  • Value selection condition at the proposers: It ensures the safety property by choosing a single value. The proposer first finds out the values already accepted by the acceptors in the majority quorum, then chooses the value with the highest proposal number and proposes that value.

Liveness with the distinguished proposer#

A distinguished proposer helps make progress and eventually chooses a value (the goal of the Paxos). It works under the assumption that a good part (more than half) of the system—which includes the proposer, acceptors, and communication network—is operating correctly.

svg viewer

We simulated the Paxos protocol using the distinguished proposer in the previous lesson. However, that was the ideal situation when the whole system was working correctly, and there was no failure. We will now simulate the protocol in the presence of some failures and see different scenarios.

No progress scenario#

A majority set of acceptors should be available at any time. This is the assumption by which the Paxos protocol works. Still, we can get into a situation when we can’t make progress even if the majority set of acceptors is available; this occurs when we have a different list of acceptors in the majority set at different times. The majority set of acceptors that was available at the time of prepare request was not the same at the time of the accept request. And more precisely, if we have the acceptors fail and recover in the same pattern repeatedly, then we can’t make progress.HTML5 Icon

For example, if the acceptors {A, B} from the majority set of available acceptors promised to a prepare request on proposal ID “k” and one of the acceptors (for example, B) failed when an accept request for proposal "k" was sent to the acceptors that promised on “k” earlier, then we wouldn't be able to get majority acceptance for proposal “k” and the corresponding value in the request, despite having a new majority set of acceptor {A, C} at that time.

The proposer will try again with another proposal number. If the above same situation happens again and again with each new proposal, we won't reach a consensus. This is shown in the following illustration:

Note: The next two slide decks are very similar to the ones we saw earlier about the dueling proposers problem and the distinguished proposer solution.

Created with Fabric.js 3.6.6
Consider there are three members of the group, A, B, and C, and each member is performing all the roles

1 of 23

Created with Fabric.js 3.6.6
To avoid a dueling proposer situation, one proposer is selected to propose at a single time and it is called the distinguished proposer

2 of 23

Created with Fabric.js 3.6.6
Let's assume that A is selected as the distinguished proposer

3 of 23

Created with Fabric.js 3.6.6
Proposer A requests the incremental unique ID generator to generate a proposal number

4 of 23

Created with Fabric.js 3.6.6
Proposer A requests the incremental unique ID generator to generate a proposal number

5 of 23

Created with Fabric.js 3.6.6
Proposer A gets proposal number 1

6 of 23

Created with Fabric.js 3.6.6
Proposer A's proposal identity is 1

7 of 23

Created with Fabric.js 3.6.6
Proposer A needs promises from majority acceptors on proposal ID 1 for which it needs to send prepare requests to the acceptors

8 of 23

Created with Fabric.js 3.6.6
Proposer A sends a prepare request to all acceptors, and the majority set of available acceptors at this time is {A, B}

9 of 23

Created with Fabric.js 3.6.6
The majority set of available acceptors {A, B} promises on the proposal ID 1

10 of 23

Created with Fabric.js 3.6.6
Proposer A receives two promises, which is a majority (2 out of 3)

11 of 23

Created with Fabric.js 3.6.6
With the majority of promises, proposer A is eligible to enter phase 2 of the protocol

12 of 23

Created with Fabric.js 3.6.6
As proposer A steps into the phase 2 of the protocol, the acceptor B fails

13 of 23

Created with Fabric.js 3.6.6
Proposer A sends accept requests to the acceptors {A, B} that promised on proposal ID 1 with a value (val_A) of its own choice because it received no values in response to prepare requests

14 of 23

Created with Fabric.js 3.6.6
Proposer A receives acceptance from acceptor A on val_A, but could't connect with acceptor B because it had failed

15 of 23

Created with Fabric.js 3.6.6
Proposer A couldn't make val_A (proposed against proposal ID 1) accepted by the majority acceptors that promised on the proposal earlier

16 of 23

Created with Fabric.js 3.6.6
Proposer A is going to try again with a new proposal for which it first requests the incremental unique ID generator for another high-numbered proposal ID

17 of 23

Created with Fabric.js 3.6.6
Proposer A now has a proposal with ID 2

18 of 23

Created with Fabric.js 3.6.6
Proposer A sends prepare requests with the proposal ID 2 to all acceptors

19 of 23

Created with Fabric.js 3.6.6
Acceptor C is not available, but we have the majority set of acceptors {A, B} available that promise to the proposer A on proposal ID 2

20 of 23

Created with Fabric.js 3.6.6
Proposer A receives a majority of promises and a value already accepted against proposal ID 1, and is ready to enter the phase 2 of the protocol

21 of 23

Created with Fabric.js 3.6.6
Phase 2 starts and the proposer A sends an accept request for proposal ID 2 on val_A ( as per value selection condition) to acceptors {A, B} that promised on proposal ID 2

22 of 23

Created with Fabric.js 3.6.6
Only one of the two acceptors accept the value because the other acceptor has failed, and we again fail to choose a value

23 of 23

Progress under the assumption of the same majority set#

Progress can only be made and a value can only be chosen when we have the same set of majority acceptors remaining available at the time of prepare requests as well as the following accept requests for at least one of the proposals, as shown in the following illustration. HTML5 Icon

Created with Fabric.js 3.6.6
Consider there are three members of the group, A, B, and C, and each member is performing all the roles

1 of 26

Created with Fabric.js 3.6.6
Let's assume that A is selected as the distinguished proposer

2 of 26

Created with Fabric.js 3.6.6
Proposer A requests the incremental unique ID generator to generate a proposal number

3 of 26

Created with Fabric.js 3.6.6
Proposer A requests the incremental unique ID generator to generate a proposal number

4 of 26

Created with Fabric.js 3.6.6
Proposer A gets proposal number 1

5 of 26

Created with Fabric.js 3.6.6
Proposer A's proposal identity is 1

6 of 26

Created with Fabric.js 3.6.6
Proposer A needs promises from majority acceptors on proposal ID 1 for which it needs to send prepare requests to the acceptors

7 of 26

Created with Fabric.js 3.6.6
Proposer A sends a prepare request to all acceptors, majority set of available acceptors at this time is {A, B}

8 of 26

Created with Fabric.js 3.6.6
The majority set of available acceptors {A, B} promise on the proposal ID 1

9 of 26

Created with Fabric.js 3.6.6
Proposer A receives two promises, which is a majority (2 out of 3)

10 of 26

Created with Fabric.js 3.6.6
By having a majority of promises, proposer A is eligible to enter phase 2 of the protocol

11 of 26

Created with Fabric.js 3.6.6
As the proposer A steps into the phase 2 of the protocol, the acceptor B fails

12 of 26

Created with Fabric.js 3.6.6
Proposer A sends accept requests to the acceptors {A, B} that promised on proposal ID 1 with a value (val_A) of its own choice because it received no values in response to prepare requests

13 of 26

Created with Fabric.js 3.6.6
Proposer A receives acceptance from acceptor A on val_A, but could't connect with acceptor B because it had failed

14 of 26

Created with Fabric.js 3.6.6
Proposer A couldn't get val_A (which was proposed against proposal ID 1) accepted by the majority acceptors that promised on the proposal earlier

15 of 26

Created with Fabric.js 3.6.6
Proposer A is going to try again with a new proposal for which it first requests the incremental unique ID generator for another high-numbered proposal ID

16 of 26

Created with Fabric.js 3.6.6
Proposer A now have a proposal with ID 2

17 of 26

Created with Fabric.js 3.6.6
Proposer A sends prepare requests with proposal ID 2 to all acceptors

18 of 26

Acceptor C is not available, but we have the majority set of acceptors {A, B} available that promise to the proposer A on proposal ID 2

19 of 26

Created with Fabric.js 3.6.6
Proposer A receives a majority of promises and a value that has already been accepted against proposal ID 1, and is ready to enter the phase 2 of the protocol

20 of 26

Phase 2 starts and the proposer A sends an accept request for proposal ID 2 on val_A ( as per value selection condition) to acceptors {A, B} that promised on proposal ID 2

21 of 26

Created with Fabric.js 3.6.6
Both acceptors accept the value

22 of 26

Created with Fabric.js 3.6.6
The value is accepted by majority and the proposer A receives these responses, and it learns that val_A is chosen. The acceptors also send the accepted responses to the distinguished set of learners.

23 of 26

There are two distinguished learners (the majority) where A is acting as proposer as well as learner, they have learned the final chosen value by counting the accepted responses they received for the same proposal ID

24 of 26

Created with Fabric.js 3.6.6
The distinguished proposers forward the final chosen value to other learners in the group

25 of 26

Created with Fabric.js 3.6.6
Everyone on the group, A, B, and C has learned the final chosen value

26 of 26

Handling node failures in majority quorum#

Paxos tolerates the failures of less than half of the total nodes in the cluster. If a proposer gets its proposal promised by a majority of the acceptors, it sends an accept request with a proposed value. If a proposer gets its proposed value accepted by a majority of the acceptors (the majority quorum), then that value is chosen, and no other value is chosen afterward. The protocol execution during partial failures is shown below:

Created with Fabric.js 3.6.6
There is a Paxos group of five members. A completes the first phase by getting promises from a majority quorum {A, B, C}, assuming the light green boxes to be the members that are not available during the first phase. In the second phase, A gets the value V1 accepted by itself (it also writes the value to the disk) but fails to get it accepted by B and C due to the failure of B and C. The first phase is not shown in the slides to keep things simple.

1 of 6

Created with Fabric.js 3.6.6
Now B proposes a value V2 of its own choice (because none of the quorum members have accepted any value before) to a majority quorum {B, C, D} and gets its value accepted by itself, and writes the value to the disk, but it fails to get the value accepted by C and D

2 of 6

Created with Fabric.js 3.6.6
Now C proposes a value V3 of its own choice (because none of the quorum members have accepted any value before) to a majority quorum {C, D, E} and gets its value accepted by itself, and writes the value to the disk, but it fails to get the value accepted by D and E

3 of 6

Created with Fabric.js 3.6.6
Now D being a proposer starts its first phase and gets promises from the majority quorum {D, E, A}, and as a result of the first phase, it finds already accepted values by the quorum members (the first phase is not shown in the slide). In the second phase, D proposes a value V1 with the highest proposal number from the known accepted values in its quorum.

4 of 6

Created with Fabric.js 3.6.6
The proposer E in the fifth round proposes value V2 because it is the value of the highest proposal number accepted by the acceptor B in the quorum {E, A, B}, and the acceptor E accepts this value against proposal number 5 in the quorum

5 of 6

Created with Fabric.js 3.6.6
Now depending upon the specific nodes in the majority quorum different values can be chosen. For example, if blue nodes are in the quorum, then V2 will be chosen. If orange nodes are in the majority quorum, then V1 will be chosen.

6 of 6

In this lesson, we learned about the basic Paxos protocol and the rationale behind the algorithm decision. In the next lesson, we will learn about multi-Paxos, an extension of the basic Paxos algorithm.

Basic Paxos in Action

Multi-Paxos